214 research outputs found

Highlights of the 1st IEEE Symposium on Biological Data Visualization

Author: Kennedy Jessie
Roerdink Jos
Publication venue: BioMed Central
Publication date: 01/01/2012

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

PubMed Central

University of Groningen Digital Archive

Dissertations of the University of Groningen

A Rule Based Taxonomy of Dirty Data

Author: Jessie Kennedy
Lin Li
Taoxin Peng
Publication venue: GSTF Journal on Computing (JoC)
Publication date: 13/09/2014

There is a growing awareness that high quality of data is a key to today’s business success and that dirty data existing within data sources is one of the causes of poor data quality. To ensure high quality data, enterprises need to have a process, methodologies and resources to monitor, analyze and maintain the quality of data. Nevertheless, research shows that many enterprises do not pay adequate attention to the existence of dirty data and have not applied useful methodologies to ensure high quality data for their applications. One of the reasons is a lack of appreciation of the types and extent of dirty data. In practice, detecting and cleaning all the dirty data that exists in all data sources is quite expensive and unrealistic. The cost of cleaning dirty data needs to be considered for most enterprises. This problem has not attracted enough attention from researchers. In this paper, a rule-based taxonomy of dirty data is developed. The proposed taxonomy not only provides a mechanism to deal with this problem but also includes more dirty data types than any existing taxonomy.

GSTF Digital Library (GSTF-DL): Open Journal Systems (Global Science and Technology Forum)

A Comparison of Techniques for Name Matching

Author: Jessie Kennedy
Lin Li
Taoxin Peng
Publication venue: GSTF Journal on Computing (JoC)
Publication date: 28/08/2014

Information explosion is a problem for everyone nowadays. It is a great challenge to all kinds of businesses to maintain high quality of data in their information applications, such as data integration, text and web mining, information retrieval, search engines, etc. In such applications, matching names is one of the popular tasks. There are a number of name matching techniques available. Unfortunately, there is no existing name matching technique that performs the best in all situations. Therefore, a problem that every researcher or practitioner has to face is how to select an appropriate technique for a given dataset. This paper analyses and evaluates a set of popular name matching techniques on several carefully designed datasets. The experimental comparison confirms the statement that there is no clear best technique. Some suggestions have been presented, which can be used as guidance for researchers and practitioners to select an appropriate name matching technique for a given dataset.

GSTF Digital Library (GSTF-DL): Open Journal Systems (Global Science and Technology Forum)

Supporting taxonomic names in cell and molecular biology databases.

Author: Kennedy Jessie
Publication venue: Mary Ann Liebert
Publication date: 01/01/2003

Groups of organisms require labels or names to refer to them. However, the idea of a single static name index, although tempting for its simplicity, is both impractical and inadvisable as a basis for referring to organisms for which data has been collected and stored for analyses and sharing. The relevant issues are described and some of the challenges facing database researchers are discussed.

Repository@Napier

Visual Encodings for Networks with Multiple Edge Types

Author: Archambault Daniel
Bach Benjamin
Kennedy Jessie
Vogogias Athanasios
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2020

This paper reports on a formal user study on visual encodings of networks with multiple edge types in adjacency matrices. Our tasks and conditions were inspired by real problems in computational biology. We focus on encodings in adjacency matrices, selecting four designs from a potentially huge design space of visual encodings. We then settle on three visual variables to evaluate in a crowdsourcing study with 159 participants: orientation, position and colour. The best encodings were integrated into a visual analytics tool for inferring dynamic Bayesian networks and evaluated by computational biologists for additional evidence. We found that the encodings performed differently depending on the task; however, colour was found to help in all tasks except when trying to find the edge with the largest number of edge types. Orientation generally outperformed position in all of our tasks.

Edinburgh Research Explorer

Cronfa at Swansea University

Extending taxonomic visualisation to incorporate synonymy and structural markers.

Author: Graham Martin
Kennedy Jessie
Publication venue: SAGE Publications
Publication date: 01/01/2005

The visualisation of taxonomic hierarchies has evolved from indented lists of names to techniques that can display thousands of nodes and on to hundreds of thousands of nodes over multiple taxonomies. However, challenges remain within multiple hierarchy visualisation, and for taxonomic hierarchy visualisation in particular. Firstly, at present, there is no support for handling specific taxonomic information such as synonymy, with current visualisations matching solely on names. Synonymy is extremely important as it reflects expert opinion on the compatibility of data held in separate taxonomies, and is needed to produce an accurate picture of taxonomic overlap. Also, current techniques for exploring large hierarchies find it difficult to convey internal reorganisations between hierarchies, with most systems showing only addition, removal or wide-ranging fragmentation of information between taxonomies. Finding the source of changes that have occurred within an existing structure is currently only achievable through exhaustive drill-down exploration. This paper describes work that tackles these problems, incorporating synonymy information into a model for multiple hierarchy visualisation of large taxonomies, and also detailing techniques that aid navigation for discovering structural reorganisations between hierarchies and for revealing information about nodes that lie below the effective display resolution of the hierarchy layout. Two examples on real taxonomic data sets are annotated to show the effectiveness of these techniques in operation.

Repository@Napier

A survey of multiple tree visualisation.

Author: Graham Martin
Kennedy Jessie
Publication venue: SAGE Publications
Publication date: 05/11/2009

This paper summarises the state-of-the-art in multiple tree visualisations. It discusses the spectrum of current representation techniques used on single trees, pairs of trees and finally multiple trees, in order to identify which representations are best suited to particular tasks and to find gaps in the representation space where opportunities for future multiple tree visualisation research may exist. The application areas from where multiple tree data are derived are enumerated, and the distinct structures that multiple trees make in combination with each other and the effect on subsequent approaches to their visualisation are discussed, along with the basic high-level goals of existing multiple tree visualisations

Repository@Napier

Exploring multiple trees through DAG representations

Author: Graham Martin
Kennedy Jessie
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2007

We present a Directed Acyclic Graph visualisation designed to allow interaction with a set of multiple classification trees, specifically to find overlaps and differences between groups of trees and individual trees. The work is motivated by the need to find a representation for multiple trees that has the space-saving property of a general graph representation and the intuitive parent-child direction cues present in individual representation of trees. Using example taxonomic data sets, we describe augmentations to the common barycenter DAG layout method that reveal shared sets of child nodes between common parents in a clearer manner. Other interactions such as displaying the multiple ancestor paths of a node when it occurs in several trees, and revealing intersecting sibling sets within the context of a single DAG representation are also discussed

CiteSeerX

Repository@Napier

Visual Exploration of Alternative Taxonomies through Concepts

Author: Graham Martin
Kennedy Jessie
Publication venue: Elsevier
Publication date: 01/01/2007

A graphical user interface is presented that allows users of taxonomic data to explore concept relationships between conflicting but related taxonomic classifications. Ecological analyses that use taxonomic metadata depend on accurate naming of specimens and taxa, and if the metadata involves several taxonomies, care has to be taken to match concepts between them. To perform this accurately requires expert-defined concept relationships, which are more complex yet more representative than the simple one-to-one mappings found through simple name matching, and can accommodate nomenclatural changes and differences in classification technique (cf ‘lumpers’ versus ‘splitters’). In the SEEK-Taxon (Scientific Environment for Ecological Knowledge) project we aim to help users of taxonomic datasets untangle and understand these relationships through a prototype visual interface which graphically displays these relationship structures, allowing users to comprehend such information and more accurately name their data

Repository@Napier